
    Reinforcement learning in populations of spiking neurons

    Population coding is widely regarded as a key mechanism for achieving reliable behavioral responses in the face of neuronal variability. But in standard reinforcement learning a flip side becomes apparent: learning slows down with increasing population size, since the global reinforcement becomes less and less related to the performance of any single neuron. We show that, in contrast, learning speeds up with increasing population size if feedback about the population response modulates synaptic plasticity in addition to global reinforcement. The two feedback signals (reinforcement and population-response signal) can be encoded by ambient neurotransmitter concentrations which vary slowly, yielding a fully online plasticity rule where the learning of a stimulus is interleaved with the processing of the subsequent one. The assumption of a single additional feedback mechanism therefore reconciles biological plausibility with efficient learning.

    Schematic model of orientation selectivity by latency processing.

    (A) In this model, a population of 7 RGCs is stimulated by the sudden appearance of a bright/dark grating, and the resulting spike trains are processed by a tempotron. To emulate a cell that detects a single horizontally oriented pattern, reminiscent of a cortical simple cell, the tempotron should fire to the preferred grating (left), but remain silent to its inverse (middle), a rotated version (right), or any other pattern of illumination. To detect a horizontal grating independent of polarity, the tempotron should fire both to the preferred grating (left) and its inverse (middle), but reject all other patterns. (B) A set of synaptic weights assigned to the 7 RGCs (left) that solves this problem. Each RGC fires a spike either early or late (if its receptive field turns dark or bright, respectively) with a relative time difference of (Figure 1H: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0053063#pone-0053063-g001). The resulting spike patterns produced by 4 different stimuli (top) are shown, with colors indicating each spike's excitatory or inhibitory contribution. Bottom panels show the postsynaptic voltage traces elicited in the tempotron (). All 126 binary stimulus patterns other than the preferred grating and its inverse produce a peak voltage of in units of the unitary PSP amplitudes. The preferred grating and its inverse produce the two highest values with ; in the present model, this occurs if . Because of the order of excitation and inhibition, the preferred grating always elicits a higher peak voltage than the inverse grating. Hence, if the spike threshold is high (green line), the tempotron detects a single pattern; if the threshold is lower (pink line), it detects horizontal gratings of both polarities.
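The threshold logic in this caption can be illustrated with a reduced sketch. It is not the 7-RGC model of the figure: it assumes only two inputs (one excitatory, one inhibitory), a double-exponential PSP kernel, and hypothetical values for the time constants (15 ms and 3.75 ms), the early/late spike delay (10 ms), the weights, and the two thresholds. None of these numbers come from the source.

```python
import math

def psp_kernel(dt, tau=15.0, tau_s=3.75):
    """Double-exponential PSP scaled to unit peak; zero before the spike.

    tau and tau_s (ms) are assumed values, not taken from the source.
    """
    if dt < 0:
        return 0.0
    t_peak = tau * tau_s / (tau - tau_s) * math.log(tau / tau_s)
    norm = math.exp(-t_peak / tau) - math.exp(-t_peak / tau_s)
    return (math.exp(-dt / tau) - math.exp(-dt / tau_s)) / norm

def peak_voltage(weights, spike_times, t_max=100.0, step=0.1):
    """Maximum over a time grid of the weighted sum of PSPs (one spike per input)."""
    best = float("-inf")
    for k in range(int(t_max / step) + 1):
        t = k * step
        v = sum(w * psp_kernel(t - s) for w, s in zip(weights, spike_times))
        best = max(best, v)
    return best

def tempotron_fires(weights, spike_times, threshold):
    """The unit fires if its peak voltage exceeds the spike threshold."""
    return peak_voltage(weights, spike_times) > threshold

# Hypothetical two-input example: input 0 is excitatory, input 1 inhibitory.
weights = [1.0, -1.0]
preferred = [0.0, 10.0]   # excitation arrives early, inhibition late
inverse = [10.0, 0.0]     # inhibition arrives early, excitation late

high_threshold, low_threshold = 0.8, 0.2   # assumed values
only_preferred = (tempotron_fires(weights, preferred, high_threshold)
                  and not tempotron_fires(weights, inverse, high_threshold))
both_polarities = (tempotron_fires(weights, preferred, low_threshold)
                   and tempotron_fires(weights, inverse, low_threshold))
```

Under these assumptions the excitation-first pattern reaches a higher peak than its inverse, so a high threshold selects one pattern while a lower threshold accepts both, mirroring the qualitative argument in the caption.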

    Effects of spike-time noise, threshold noise, and readout time on tempotron performance.

    (A) Scatterplot of first-spike latencies of two RGCs on multiple trials for each of the eight stimulus phases in the highest contrast condition. For this cell pair, the latencies covary, with an average correlation coefficient of 0.46 over all eight grating phases. (B) Histogram of correlation coefficients for first-spike latencies observed for all simultaneously recorded cell pairs, stimuli, and contrasts (black). Note the excess of positive correlations. As a control, the gray line shows the analogous histogram obtained when correlating latencies of the two cells separated by one stimulus trial. (C) Top, histogram of relative latencies for the cell pair of (A) for the boundary task of Figure 4C2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0053063#pone-0053063-g004). Bottom, the same histogram of relative latencies, but obtained from shifted trials. Note the increased dispersion of the relative latencies and the increased overlap between the red and blue peaks. (D) Effect of latency correlations on readout performance. For all simultaneously recorded cell pairs, we obtained the minimal tempotron error rates with inputs from simultaneous trials and with inputs from shifted trials. The ratio of these two error rates is plotted against the error rate obtained for simultaneous trials. Squares: luminance task; circles: boundary task. Symbol colors represent the mean latency correlation across the four stimuli that constitute a given task (color bar). Note that most of the points lie below unity, showing that in most cases the readout performance degrades when trials are shifted and latency correlations destroyed. (E) Distributions of the peak voltage of the tempotron for the boundary task, based on the distributions of relative latencies shown in (C) (top; same color code). The two voltage distributions result from optimizing the tempotron weights for different PSP kinetics, namely (top) and (bottom). In this example the shorter PSPs generate a much larger separation between the maximal voltages for target and null stimuli (red vs blue), such that the spike readout would be more robust to any noise in the neuron's threshold. (F) Optimal classification performance on the boundary task with the input cell pair of (C), assuming Gaussian threshold noise whose standard deviation is 5% of the mean synaptic weight magnitude. The error was minimized for each PSP time constant (x-axis) over the synaptic efficacies for the purely excitatory solution (black, Figure 4D2) and the mixed solution with one excitatory and one inhibitory input (gray, Figure 4E2). (G) Performance of optimal tempotrons operating with a pair of RGCs on the luminance task as a function of the maximal allowed latency, , of input spikes. The fraction of correct classifications was averaged over all input cell pairs that allowed for error-free performance in the highest contrast condition at large . Results are plotted for different values of the stimulus contrast (indicated in %). (H) As in (G) but for the boundary task and the fraction of correct classifications averaged over all input cell pairs with errors below 5% in the highest contrast condition at large .
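The shifted-trial control described in panels (B) to (D) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis code: the latencies are generated from an assumed model (a shared trial-to-trial jitter plus independent noise per cell), and the function names `pearson` and `shifted_pairs` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def shifted_pairs(x, y, shift=1):
    """Pair trial i of the first cell with trial i + shift of the second cell."""
    return x[:-shift], y[shift:]

# Synthetic first-spike latencies (ms); all numbers here are assumptions.
rng = random.Random(0)
shared = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # jitter common to both cells
lat1 = [50.0 + s + rng.gauss(0.0, 1.0) for s in shared]
lat2 = [55.0 + s + rng.gauss(0.0, 1.0) for s in shared]

r_simultaneous = pearson(lat1, lat2)               # same-trial correlation
r_shifted = pearson(*shifted_pairs(lat1, lat2))    # control: offset by one trial
```

Shifting by one trial keeps each cell's own latency distribution intact but removes the shared trial-by-trial component, so in this synthetic setting the shifted correlation should fall close to zero while the simultaneous one stays clearly positive.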